End-to-end machine learning project working with Boston Housing dataset.

This project will follow the 8 steps outlined below. Dataset was chosen to model the project from Chapter 2 and imported from the sklearn library.

  1. Looking at the big picture
  2. Getting the data
  3. Discovering and visualizing the data to gain insights (EDA)
  4. Preparing the data for Machine Learning algorithms
  5. Selecting a model and training it
  6. Fine-tuning model
  7. Presentiing solution
  8. Launching, monitoring, and maintaining the system

1. Looking at the big picture

What is the business objective?

What performance measure am I using?

1. MSE 
2. RSME 
3. Decision Tree Regressor 
4. Ramdon Forest Regressor 

Define business objective

The purpose of project is to predict the median value of prices of houses in the Boston area, using features provided from the dataset.

2. Getting the data

Creating a test set

Which method is better?

3. EDA

4. Prepare data for Machine Learning algorithms

5. Selecting and training a model

Training model

Evaluating model's prediction accuracy

RMSE

Decision tree model

Better Evaluation Using Cross-Validation

SKlearn's cross-validation feature

6. Fine Tune Model

Evaluating model on test set

Results

Predicted median house value in Boston are is 12.95. The features that had the most impact on the prediction were found to be % lower status of the population (LSTAT) and average number of rooms per house (RM).